tangent distance
A Robust Prototype-Based Network with Interpretable RBF Classifier Foundations
Saralajew, Sascha, Rana, Ashish, Villmann, Thomas, Shaker, Ammar
Prototype-based classification learning methods are known to be inherently interpretable. However, this paradigm suffers from major limitations compared to deep models, such as lower performance. This led to the development of the so-called deep Prototype-Based Networks (PBNs), also known as prototypical parts models. In this work, we analyze these models with respect to different properties, including interpretability. In particular, we focus on the Classification-by-Components (CBC) approach, which uses a probabilistic model to ensure interpretability and can be used as a shallow or deep architecture. We show that this model has several shortcomings, like creating contradicting explanations. Based on these findings, we propose an extension of CBC that solves these issues. Moreover, we prove that this extension has robustness guarantees and derive a loss that optimizes robustness. Additionally, our analysis shows that most (deep) PBNs are related to (deep) RBF classifiers, which implies that our robustness guarantees generalize to shallow RBF classifiers. The empirical evaluation demonstrates that our deep PBN yields state-of-the-art classification accuracy on different benchmarks while resolving the interpretability shortcomings of other approaches. Further, our shallow PBN variant outperforms other shallow PBNs while being inherently interpretable and exhibiting provable robustness guarantees.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (10 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Efficient Computation of Complex Distance Metrics Using Hierarchical Filtering
By their very nature, memory based algorithms such as KNN or Parzen windows require a computationally expensive search of a large database of prototypes. In this paper we optimize the search(cid:173) ing process for tangent distance (Simard, LeCun and Denker, 1993) to improve speed performance. The closest prototypes are found by recursively searching included subset.s of the database using dis(cid:173) tances of increasing complexit.y. This is done by using a hierarchy of tangent distances (increasing the Humber of tangent. At each stage, a confidence level of the classification is computed.
Learning Prototype Models for Tangent Distance
Simard, LeCun & Denker (1993) showed that the performance of nearest-neighbor classification schemes for handwritten character recognition can be improved by incorporating invariance to spe(cid:173) the so cific transformations in the underlying distance metric - called tangent distance. The resulting classifier, however, can be prohibitively slow and memory intensive due to the large amount of prototypes that need to be stored and used in the distance compar(cid:173) isons. In this paper we develop rich models for representing large subsets of the prototypes. These models are either used singly per class, or as basic building blocks in conjunction with the K-means clustering algorithm.
A Constructive Learning Algorithm for Discriminant Tangent Models
To reduce the computational complexity of classification systems using tangent distance, Hastie et al. rithm to devise rich models for representing large subsets of the data which computes automatically the "best" associated tan(cid:173) gent subspace. We propose a gradient based constructive learning algorithm for building a tangent subspace model with discriminant capabilities which combines several of the the advantages of both HSS and Diabolo: devised tangent models hold discriminant capabilities, space requirements are improved with respect to HSS since our algorithm is discriminant and thus it needs fewer prototype models, dimension of the tangent subspace is determined automatically by the constructive algorithm, and our algorithm is able to learn new transformations.
Classification by Set Cover: The Prototype Vector Machine
Bien, Jacob, Tibshirani, Robert
We introduce a new nearest-prototype classifier, the prototype vector machine (PVM). It arises from a combinatorial optimization problem which we cast as a variant of the set cover problem. We propose two algorithms for approximating its solution. The PVM selects a relatively small number of representative points which can then be used for classification. It contains 1-NN as a special case. The method is compatible with any dissimilarity measure, making it amenable to situations in which the data are not embedded in an underlying feature space or in which using a non-Euclidean metric is desirable. Indeed, we demonstrate on the much studied ZIP code data how the PVM can reap the benefits of a problem-specific metric. In this example, the PVM outperforms the highly successful 1-NN with tangent distance, and does so retaining fewer than half of the data points. This example highlights the strengths of the PVM in yielding a low-error, highly interpretable model. Additionally, we apply the PVM to a protein classification problem in which a kernel-based distance is used.
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > New York (0.04)
- (4 more...)
- Health & Medicine (1.00)
- Government > Regional Government (0.46)
Multiresolution Tangent Distance for Affine-invariant Classification
Vasconcelos, Nuno, Lippman, Andrew
The ability to rely on similarity metrics invariant to image transformations is an important issue for image classification tasks such as face or character recognition. We analyze an invariant metric that has performed well for the latter - the tangent distance - and study its limitations when applied to regular images, showing that the most significant among these (convergence to local minima) can be drastically reduced by computing the distance in a multiresolution setting. This leads to the multi resolution tangent distance, which exhibits significantly higher invariance to image transformations, and can be easily combined with robust estimation procedures.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Multiresolution Tangent Distance for Affine-invariant Classification
Vasconcelos, Nuno, Lippman, Andrew
The ability to rely on similarity metrics invariant to image transformations is an important issue for image classification tasks such as face or character recognition. We analyze an invariant metric that has performed well for the latter - the tangent distance - and study its limitations when applied to regular images, showing that the most significant among these (convergence to local minima) can be drastically reduced by computing the distance in a multiresolution setting. This leads to the multi resolution tangent distance, which exhibits significantly higher invariance to image transformations, and can be easily combined with robust estimation procedures.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Multiresolution Tangent Distance for Affine-invariant Classification
Vasconcelos, Nuno, Lippman, Andrew
The ability to rely on similarity metrics invariant to image transformations isan important issue for image classification tasks such as face or character recognition. We analyze an invariant metric that has performed well for the latter - the tangent distance - and study its limitations when applied to regular images, showing that the most significant among these (convergence to local minima) can be drastically reduced by computing the distance in a multiresolution setting. This leads to the multiresolution tangent distance, which exhibits significantly higher invariance to image transformations,and can be easily combined with robust estimation procedures.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
A Constructive Learning Algorithm for Discriminant Tangent Models
Sona, Diego, Sperduti, Alessandro, Starita, Antonina
To reduce the computational complexity of classification systems using tangent distance, Hastie et al. (HSS) developed an algorithm to devise rich models for representing large subsets of the data which computes automatically the "best" associated tangent subspace. Schwenk & Milgram proposed a discriminant modular classification system (Diabolo) based on several autoassociative multilayer perceptrons which use tangent distance as error reconstruction measure. We propose a gradient based constructive learning algorithm for building a tangent subspace model with discriminant capabilities which combines several of the the advantages of both HSS and Diabolo: devised tangent models hold discriminant capabilities, space requirements are improved with respect to HSS since our algorithm is discriminant and thus it needs fewer prototype models, dimension of the tangent subspace is determined automatically by the constructive algorithm, and our algorithm is able to learn new transformations.
A Constructive Learning Algorithm for Discriminant Tangent Models
Sona, Diego, Sperduti, Alessandro, Starita, Antonina
To reduce the computational complexity of classification systems using tangent distance, Hastie et al. (HSS) developed an algorithm to devise rich models for representing large subsets of the data which computes automatically the "best" associated tangent subspace. Schwenk & Milgram proposed a discriminant modular classification system (Diabolo) based on several autoassociative multilayer perceptrons which use tangent distance as error reconstruction measure. We propose a gradient based constructive learning algorithm for building a tangent subspace model with discriminant capabilities which combines several of the the advantages of both HSS and Diabolo: devised tangent models hold discriminant capabilities, space requirements are improved with respect to HSS since our algorithm is discriminant and thus it needs fewer prototype models, dimension of the tangent subspace is determined automatically by the constructive algorithm, and our algorithm is able to learn new transformations.